perm filename SCCPP.TIM[TIM,LSP]7 blob sn#681173 filedate 1982-09-30 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00010 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	SCCPP	Multics
C00005 00003	SCCPP	Rutgers
C00011 00004	SCCPP 	SAIL
C00014 00005	SCCPP	Franz
C00019 00006	SCCPP 	Texas
C00022 00007	SCCPP	Multics
C00023 00008	UTLISP 5.1
C00025 00009	LM-2
C00027 00010	 InterLisp on SCORE
C00029 ENDMK
C⊗;
SCCPP	Multics
∂06-Apr-81  1931	Bernard S. Greenberg       <Greenberg at MIT-Multics> 	Re: Timing benchmark
Date:     6 April 1981 2147-est
From:     Bernard S. Greenberg       <Greenberg at MIT-Multics>
Subject:  Re: Timing benchmark
To:       RPG at SU-AI
Cc:       "@LSPTRN.DIS[P,DOC]" at SU-AI

I ran this program three times on MIT-Multics. Here's what it said.

Runtime (secs)   86.4   86.4   86.8
gctime  (secs)    2.4    3.8    2.5   (41 users out of max set at 120)

Before we accept that this implementation is really 60 times
slower than MACLISP on SAIL, I would like to point out that
the unmodified program, sent in the mail, also ran on AI-ITS
MACLISP, also compiled, for FIVE SOLID WALL-CLOCK MINUTES (as
it did on Multics) without opening its mouth, but I quit it
before it finished on AI (several times). The KA-10 is reported
to be 2 or 3 times slower (instruction rate) than Multics.
The KL-10 at SAIL, given all benefit of the doubt, is NOT
TWO ORDERS OF MAGNITUDE faster than the AI KA-10.

Has anyone else encountered difficulty in making this program run
in a small number of seconds of CPU time?  Perhaps there is some
subtle conversion problem I missed in my perusal of this program?
(I made no modifications other than converting the characters to
lower-case, and on Multics, macroing *catch/*throw to catch/throw).

Did you compile these functions? If not, please compile them and
send those results too. Thanks.
			-rpg-
SCCPP	Rutgers
∂06-Apr-81  2008	HEDRICK at RUTGERS 	Re: Timing benchmark     
Date:  6 Apr 1981 2307-EST
From: HEDRICK at RUTGERS
Subject: Re: Timing benchmark   
To: RPG at SU-AI
In-Reply-To: Your message of 6-Apr-81 1602-EST

I translated your benchmark to R/UCI Lisp.  The only non-obvious
translation was due to the fact that in R/UCI Lisp (as in all Lisp 1.6
derivatives) there can only be 5 arguments to compiled functions.
Fortunately the arguments beyond the 4th were always NIL in your case,
so I just eliminated them.  (R/UCI Lisp has a special kind of function,
an LEXPR, which allows more than 5 arguments.  However, its use did not
seem necessary in this case.)
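
A minimal sketch of the rewrite just described, using made-up names, since
the real PAIRS definition and its parameter names are not reproduced here:

;; Hypothetical 7-argument function whose last three arguments are NIL at
;; every call site in the benchmark:
(DEFUN DEMO7 (A B C D E F G)
  (LIST A B C D E F G))

;; The same function reduced to 4 arguments, with the constant NILs bound
;; by an inner LAMBDA so the compiled function stays under the 5-argument
;; limit:
(DEFUN DEMO4 (A B C D)
  ((LAMBDA (E F G) (LIST A B C D E F G))
   NIL NIL NIL))

;; (DEMO4 1 2 3 4) and (DEMO7 1 2 3 4 NIL NIL NIL) both return
;; (1 2 3 4 NIL NIL NIL).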

Totals, including GC:

R/UCI Lisp (with free space expanded by 50K):
  interpreted:  15.0
  compiled:      4.6
  ", NOUUO:      3.2  (of which .6 is GC)

By NOUUO I mean that linkage between compiled functions is by direct
PUSHJ, not going through the interpreter.  This makes tracing and
breaking impossible.  (In Elisp we will use a protocol that does not
have this problem.)  Note that total runtimes are slightly better than
RPG's MACLisp timings.  However more of it is runtime and less is GC.
I conjecture that this will be typical of Lisp programs whose only
arithmetic involves small integers.  MACLisp will produce better
code for small integers, but will have to box and unbox them when returning
from functions or putting them into data structures, causing faster
runtime but more GC time.  The first call is no different from the others
because in R/UCI Lisp there is no automatic expansion.  We had to
explicitly expand free storage by 50K before running.

Elisp (extended addressing Lisp) does not yet have a compiler.
Therefore some of the system functions (e.g. MEMBER, CADR) are running
interpreted.  This slows things down noticeably.  To get an idea of how
Elisp is going to work out, I compared it with a copy of R/UCI Lisp in
which the same functions are being interpreted.  [The temptation is
great to simply code these things in assembly language, since there are
really only a few at this point.  However I will attempt to resist this
temptation and continue to compare Elisp with this semi-interpreted
R/UCI Lisp.]  Elisp uses dynamic expansion and contraction of memory.
However there is no apparent difference between the first time and
other times (possibly because memory has contracted to its initial
state by the end).

Elisp:
  interpreted:  28.5  (E-box time, as used by RPG, 20.1 sec.)

R/UCI Lisp (with the same functions interpreted):
  interpreted:  32.6

So Elisp is (as usual) a small amount faster than R/UCI Lisp.  This
suggests that a good prediction for the final version of Elisp is 14
sec. interpreted.  I am not ready to make predictions for Elisp compiled
code, as we don't know how the pager refill problem is going to affect
it.  My guess is that it will be slightly slower than R/UCI Lisp, with
slowdown due to pager refills (from extended addressing) somewhat offset
by the better code generated by Utah's compiler.

Note that I normally report "normal" CPU time, not E-box times as used
by RPG. The E-box times will be noticeably smaller in the case of Elisp.
I regard the use of E-box time with Elisp as problematical, since it
is significantly smaller than conventional CPU time, even with 0 load
on the system.  I think this shows that E-box time omits some sort of
overhead that Elisp generates, probably pager refills (though the
documentation says that refills are not counted).  Until someone convinces
me otherwise, I will regard conventional CPU time as a better index of
Elisp's load on the system.  I report CPU times with fairly light (1 to 2)
load averages.

-------

SCCPP 	SAIL

Compiled
(RUNTIME 1.969) 	;ebox
(GCTIME 29.914) 	;ebox
(WRUNTIME 3.44501406) 
(WGCTIME 52.3383193) 
(WHOLINE 55.7833333) 

2592 
(RUNTIME 1.903) 
(GCTIME 2.616) 
(WRUNTIME 3.34783137) 
(WGCTIME 4.6021686) 
(WHOLINE 7.95) 

2592 
(RUNTIME 2.008) 
(GCTIME 2.61) 
(WRUNTIME 3.56552616) 
(WGCTIME 4.6344738) 
(WHOLINE 8.2) 

2592 
(RUNTIME 1.959) 
(GCTIME 2.065) 
(WRUNTIME 3.86217695) 
(WGCTIME 4.0711564) 
(WHOLINE 7.93333334) 

2592 
(RUNTIME 1.904) 
(GCTIME 1.921) 
(WRUNTIME 3.5259259) 
(WGCTIME 3.55740738) 
(WHOLINE 7.0833333) 

Interpreted

2592 
(RUNTIME 17.572) 
(GCTIME 31.451) 
(WRUNTIME 37.176616) 
(WGCTIME 66.54005) 
(WHOLINE 103.716666) 

2592 
(RUNTIME 17.191) 
(GCTIME 2.586) 
(WRUNTIME 39.1158924) 
(WGCTIME 5.8841078) 
(WHOLINE 45.0) 

2592 
(RUNTIME 17.32) 
(GCTIME 2.642) 
(WRUNTIME 35.8194237) 
(WGCTIME 5.4639098) 
(WHOLINE 41.2833333) 

2592 
(RUNTIME 17.414) 
(GCTIME 2.103) 
(WRUNTIME 40.106538) 
(WGCTIME 4.84346205) 
(WHOLINE 44.95) 

2592 
(RUNTIME 17.559) 
(GCTIME 1.955) 
(WRUNTIME 45.3507023) 
(WGCTIME 5.049298) 
(WHOLINE 50.4) 

Compiled
2592 
(RUNTIME 1.948) 
(GCTIME 29.711) 

2592 
(RUNTIME 1.887) 
(GCTIME 2.599) 

2592 
(RUNTIME 1.886) 
(GCTIME 2.565) 

2592 
(RUNTIME 1.892) 
(GCTIME 1.922) 

2592 
(RUNTIME 1.895) 
(GCTIME 1.973) 
SCCPP	Franz
∂11-Apr-81  1001	CSVAX.jkf at Berkeley 	result of pairs benchmark on franz.  
Date: 11 Apr 1981 09:56:26-PST
From: CSVAX.jkf at Berkeley
To: rpg@su-ai
Subject: result of pairs benchmark on franz.

							pair benchmark results
							submitted by j foderaro
							(csvax.jkf@berkeley)
							10 april 81

Here are the results of running the pairs benchmark on Franz Lisp on 
a VAX 11/780 running Berkeley 4BSD Unix.  The load average was less
than one when the timing was done.  Timing on Unix is done in 60ths of
a second and the time charged to a process includes some of the system
overhead for memory management.  I ran the benchmarks on an unloaded
system to reduce the interference of the memory manager.

The program was run with no modification other than to the test function
shown below.  Perhaps in the future you should use macros like (cpu-time)
and (gc-time), which each site can then define for itself (a rough sketch
of such macros follows the test function).

(defun test ()
  ;; The three trailing forms seed T1, X, and GT with the run-time clock,
  ;; NIL, and the GC clock before PAIRS runs; afterwards T1 and GT hold
  ;; the elapsed total time and elapsed GC time.
  ((lambda (t1 x gt)
     (setq x (pairs a b () 'equal () () ()))
     (setq t1 (- #-Franz (runtime)
                 #+Franz (car (ptime))
                 t1))
     (setq gt (- #-Franz (status gctime)
                 #+Franz (cadr (ptime))
                 gt))
     (print (length x))
     (print (list 'runtime
                  (quotient (float (- t1 gt))
                            #-Franz 1000000.
                            #+Franz 60.)))
     (print (list 'gctime
                  (quotient (float gt)
                            #-Franz 1000000.
                            #+Franz 60.)))
     #+Franz (terpri))
   #-Franz (runtime) #+Franz (car (ptime))
   ()
   #-Franz (status gctime)
   #+Franz (cadr (ptime))))
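
A rough sketch of the suggested site-definable macros: the names cpu-time
and gc-time come from the suggestion above, and the bodies merely package
the same runtime/ptime/gctime calls and scale factors already used in TEST,
so they are illustrative only.

(defmacro cpu-time ()                   ; accumulated CPU time, in seconds
  #-Franz '(quotient (float (runtime)) 1000000.)
  #+Franz '(quotient (float (car (ptime))) 60.))

(defmacro gc-time ()                    ; accumulated GC time, in seconds
  #-Franz '(quotient (float (status gctime)) 1000000.)
  #+Franz '(quotient (float (cadr (ptime))) 60.))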


---
The size of the compiled file is 2768 bytes.

Here are the results:

Script started on Fri Apr 10 22:16:58 1981
Reval: lisp
Franz Lisp, Opus 34
-> (load 'pairs)
[fasl pairs.o]
t
-> (test)
2592(runtime 7.033333333333333)(gctime 23.53333333333333)
nil
-> (test)
2592(runtime 7.383333333333333)(gctime 4.816666666666667)
nil
-> (test)
2592(runtime 7.283333333333333)(gctime 4.366666666666667)
nil
-> (test)
2592(runtime 7.333333333333333)(gctime 4.666666666666667)
nil
-> (exit)

------
I looked at which functions were being called, and it seems just about all this
benchmark does is call 'member' 10,000 times.  I noticed that in our system
'memq' would do as good a job as 'member', so I replaced member by memq
and ran it with these results (a short sketch of the memq/member difference
follows the transcript):

Reval: lisp
Franz Lisp, Opus 34
-> (load 'memqpairs)
[fasl memqpairs.o]
t
-> (test)
2592(runtime 1.683333333333333)(gctime 23.55)
nil
-> (test)
2592(runtime 1.733333333333333)(gctime 4.833333333333333)
nil
-> (test)
2592(runtime 1.766666666666667)(gctime 4.35)
nil
-> (test)
2592(runtime 1.783333333333333)(gctime 4.7)
nil
-> (exit)
script done on Fri Apr 10 22:21:50 1981
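
The memq/member difference being exploited, sketched briefly: member compares
elements with equal, memq with eq, so for symbols and other objects on which
eq and equal agree the two return the same answer.

(member '(a) '((a) (b)))   ; => ((a) (b))  equal compares structure
(memq   '(a) '((a) (b)))   ; => nil        eq wants the identical cons
(memq   'a   '(a b c))     ; => (a b c)    same answer as member here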

SCCPP 	Texas
∂07-Apr-81  2213	Mabry Tyson <ATP.Tyson at UTEXAS-20> 	SCCPP on UCI-Lisp
Date:  8 Apr 1981 0001-CST
From: Mabry Tyson <ATP.Tyson at UTEXAS-20>
Subject: SCCPP on UCI-Lisp
To: rpg at SU-AI

		Results of LISP timing tests for UCI-Lisp

(Times are in the form R+G where R is the runtime (not including GC time)
and G is the GC time.  All times are in seconds.  "Interp" means interpreted,
"Slow" means compiled but using UUO-links, "Fast" means compiled with the UUO
links replaced by direct jumps.)

				   Processor
Program			KL-2060			KI-1060
	   Interp	Slow	  Fast	     Interp	  Slow		Fast

SCCPP:
 Free:
  30000	 14.00+2.78  5.32+2.38	3.38+2.78   53.43+8.58  21.14+11.62 12.77+11.56
	 14.04+2.83  5.37+2.70	3.35+2.71		20.40+11.47 12.63+11.36
				3.34+2.70		20.72+11.50 12.71+11.44

  60000	 14.60+0.58  5.40+0.50	3.35+0.53   52.80+1.42	21.18+1.45  12.81+1.39
	 14.20+0.64  5.44+0.52	3.36+0.53		21.27+1.38  12.34+1.43
	 14.09+0.63  5.37+0.52	3.35+0.52		21.19+1.40  12.93+1.40
	 14.22+0.61  5.35+0.52	3.40+0.53

  Notes: The functions with more than 5 arguments were changed to 5-argument
functions by making the last 3 into a list.  These were involved in less than
3% of the run time, so the difference is insignificant.  The timings on the
2060 were with a load average less than 1.  The timings on the KI-10 were with
a moderate load.  The differences in the various timing runs are probably due
to system overhead from swapping.  The 30K free space examples had about 5
or 6 GC's while the 60K free space examples had one GC each.
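
A minimal sketch of the rewrite described in the notes, with made-up names
since the real PAIRS parameter names are not shown in this file:

;; A 7-argument function becomes a 5-argument one by passing the last three
;; values as a single list.
(DEFUN DEMO5 (A B C D REST3)
  ((LAMBDA (E F G)
     (LIST A B C D E F G))              ; body uses E, F and G as before
   (CAR REST3) (CADR REST3) (CADDR REST3)))

;; A call such as (OLD-FN 1 2 3 4 5 6 7) becomes (DEMO5 1 2 3 4 (LIST 5 6 7)).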
-------

SCCPP	Multics
;;; compiled
(test)
2592.
(runtime 3.690232)
(gctime 2.478373)

(test)
2592.
(runtime 3.679003)
(gctime 3.930743)

(test)
2592.
(runtime 3.693353)
(gctime 2.650682)
;;;UTLISP 5.1
Following are the timings for SCCPP for UTLISP 5.1. Because of local
interactive field length (= core size) restrictions, all runs were
submitted as batch jobs. "runtime" does not include "gctime".


Interpreted:

   run#:       #1        #2        #3        #4        #5

runtime:     18.15     18.17     18.25     18.22     18.26
 gctime:      1.34      1.37      1.34      1.35      1.34

Run at 200000 (octal) field length with: 45300. words of free space
                                          3030. words of full space
There were three garbage collections for each run.


Compiled:

   run#:       #1        #2        #3        #4        #5

runtime:      3.95      4.09      4.06      4.10      4.03
 gctime:       .97       .82       .84       .82       .83

Run at 20000 (octal) field length with: 45524. words of free space
                                         2957. words of full space
There were three garbage collections for each run.
-------

;;;LM-2
;; 1. SCCPP

(DEFUN TEST-SCCPP ()
  (LENGTH (TIMING "SCCPP"
		  (PAIRS A B () 'EQUAL () () ()))))

;; Compiled:  9.3 seconds.
;; Interpreted:  116.3 seconds.

;; This test seems to suffer from an incredibly cretinously written NCONC.
;; After adding an NCONC optimizer to the compiler (which will exist in 211):
;; Compiled:  7.9 seconds.
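
;; For reference, a destructive two-argument NCONC is roughly the following.
;; This is a sketch only, not the actual system NCONC nor the later
;; microcoded version:
(DEFUN NCONC2 (X Y)
  (COND ((NULL X) Y)
        (T (RPLACD (LAST X) Y)          ; splice Y onto the last cons of X
           X)))                         ; return the destructively extended X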

∂04-Aug-82  1052	HIC at SCRC-TENEX 	[DLA: forwarded]
Date: Wednesday, 4 August 1982  12:26-EDT
From: HIC at SCRC-TENEX
To: rpg at sail
Subject: [DLA: forwarded]

Date: Saturday, 31 July 1982, 09:46-EDT
From: David L. Andre <DLA>
To:   HIC
cc:   DLA

While I was editing UCADR, I decided to microcode a two-argument NCONC.
With this microcoded NCONC, the SCCPP benchmark radically improved:

;; Old results:
;; Compiled:  9.3 seconds.
;; Interpreted:  116.3 seconds.

;; This test seems to suffer from an incredibly cretinously written NCONC.
;; After adding an NCONC optimizer to the compiler (which will exist in 211):
;; Compiled:  7.9 seconds.

;; With microcoded NCONC (System 211, UCADR 920)
;; Compiled: 6.0 seconds.

You may wish to forward this to the appropriate people.

;;; InterLisp on SCORE
∂30-Sep-82  1218	James Bennett <csd.Bennett at SU-SCORE> 	new timings   
Date: 30 Sep 1982 1207-PDT
From: James Bennett <csd.Bennett at SU-SCORE>
Subject: new timings
To: rpg at SU-AI
Stanford-Phone: (415) 497-2225
Home-Phone: (415) 322-2233

Dick,
	Here they are:
6←(FLENGTH (TIME (PAIRS A B NIL 'EQUAL) 1 0]

collecting lists
3234, 10402 free cells

collecting lists
2842, 10010 free cells

collecting lists
3984, 10128 free cells

collecting lists
4052, 10196 free cells

collecting lists
16161, 16161 free cells
51493 conses
4.845 seconds
22.672 seconds, real time
2592
7←USE 4 FOR 1 IN 6

collecting lists
25873, 25873 free cells

collecting lists
9746, 10258 free cells

collecting lists
15907, 15907 free cells

collecting lists
6358, 10454 free cells

collecting lists
3642, 10298 free cells

collecting lists
4118, 10262 free cells

collecting lists
5956, 10052 free cells

collecting lists
35319, 35319 free cells

collecting lists
11874, 11874 free cells

collecting lists
38897, 38897 free cells

collecting lists
12308, 12308 free cells
205972/4 = 51493 conses
18.741/4 = 4.68525 seconds
200.712 seconds, real time
2592
10←DRIBBLE]
-------